Overview

Dataset statistics

Number of variables: 13
Number of observations: 2500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 495.4 KiB
Average record size in memory: 202.9 B

Variable types

Numeric: 12
Categorical: 1

Alerts

area is highly overall correlated with perimeter and 4 other fields (High correlation)
perimeter is highly overall correlated with area and 7 other fields (High correlation)
major_axis_length is highly overall correlated with area and 8 other fields (High correlation)
minor_axis_length is highly overall correlated with area and 7 other fields (High correlation)
convex_area is highly overall correlated with area and 4 other fields (High correlation)
equiv_diameter is highly overall correlated with area and 4 other fields (High correlation)
eccentricity is highly overall correlated with major_axis_length and 5 other fields (High correlation)
roundness is highly overall correlated with perimeter and 8 other fields (High correlation)
aspect_ration is highly overall correlated with perimeter and 7 other fields (High correlation)
compactness is highly overall correlated with perimeter and 7 other fields (High correlation)
class is highly overall correlated with perimeter and 6 other fields (High correlation)
solidity is highly overall correlated with roundness (High correlation)
extent is highly overall correlated with roundness and 2 other fields (High correlation)

Reproduction

Analysis started: 2022-12-06 23:03:58.603231
Analysis finished: 2022-12-06 23:04:36.261762
Duration: 37.66 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json
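
A report like this one can be regenerated roughly as sketched below. This assumes pandas-profiling v3.5.0 (the version listed above) and pandas are installed; the CSV path, output path, and report title are hypothetical placeholders, since the source data file is not named in this report.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report (paths are placeholders)."""
    # Imported lazily so the function can be defined without the libraries present.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Profiling report")  # title is illustrative
    report.to_file(html_path)
    return report
```

Calling `build_report("data.csv")` would write `report.html` next to the script; the timestamps and duration will of course differ from the run recorded here.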

Variables

area
Real number (ℝ)

Distinct: 2424
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80658.221
Minimum: 47939
Maximum: 136574
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 47939
5-th percentile: 60774.85
Q1: 70765
median: 79076
Q3: 89757.5
95-th percentile: 104823.8
Maximum: 136574
Range: 88635
Interquartile range (IQR): 18992.5
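
The range and IQR rows follow directly from the other quantiles. The short check below uses the `area` values copied from this table. The 1.5 × IQR fences are a common rule of thumb added here for illustration; the report itself does not compute them.

```python
# Values copied from the quantile table for `area`.
minimum, q1, q3, maximum = 47939, 70765, 89757.5, 136574

value_range = maximum - minimum   # spread of the full data
iqr = q3 - q1                     # spread of the middle 50 % of the data

# Tukey-style fences (an assumption for illustration, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(value_range, iqr, lower_fence, upper_fence)
```

With these numbers the maximum lies above the upper fence while the minimum stays inside the lower one, which agrees with the mild positive skew reported for this variable.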

Descriptive statistics

Standard deviation: 13664.51
Coefficient of variation (CV): 0.16941249
Kurtosis: 0.12899636
Mean: 80658.221
Median Absolute Deviation (MAD): 9278.5
Skewness: 0.49599901
Sum: 2.0164555 × 10⁸
Variance: 1.8671884 × 10⁸
Monotonicity: Not monotonic
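
Several of these descriptive statistics are tied together by simple identities. As a rough consistency check, assuming the rounded `area` values printed in this report, the coefficient of variation, variance, and sum can be recomputed from the mean, standard deviation, and observation count:

```python
import math

n = 2500            # observations, from the dataset overview
mean = 80658.221    # reported mean of `area`
std = 13664.51      # reported standard deviation of `area`

cv = std / mean         # coefficient of variation
variance = std ** 2     # variance is the squared standard deviation
total = mean * n        # sum of all values

print(f"CV={cv:.8f}  variance={variance:.7e}  sum={total:.7e}")
```

Because the inputs are rounded, the recomputed values match the reported ones only to about six significant digits.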
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
75637 3
 
0.1%
97268 3
 
0.1%
68063 3
 
0.1%
96928 2
 
0.1%
76461 2
 
0.1%
72953 2
 
0.1%
74431 2
 
0.1%
74336 2
 
0.1%
85054 2
 
0.1%
88634 2
 
0.1%
Other values (2414) 2477
99.1%
ValueCountFrequency (%)
47939 1
< 0.1%
48098 1
< 0.1%
49171 1
< 0.1%
49273 1
< 0.1%
49673 1
< 0.1%
50475 1
< 0.1%
50670 1
< 0.1%
50731 1
< 0.1%
50822 1
< 0.1%
51555 1
< 0.1%
ValueCountFrequency (%)
136574 1
< 0.1%
135455 1
< 0.1%
132035 1
< 0.1%
130913 1
< 0.1%
130071 1
< 0.1%
127033 1
< 0.1%
126963 1
< 0.1%
125949 1
< 0.1%
125697 1
< 0.1%
125214 1
< 0.1%

perimeter
Real number (ℝ)

Distinct: 2490
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1130.279
Minimum: 868.485
Maximum: 1559.45
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 868.485
5-th percentile: 964.9201
Q1: 1048.8297
median: 1123.672
Q3: 1203.3405
95-th percentile: 1320.3929
Maximum: 1559.45
Range: 690.965
Interquartile range (IQR): 154.51075

Descriptive statistics

Standard deviation: 109.25642
Coefficient of variation (CV): 0.096663228
Kurtosis: -0.021849642
Mean: 1130.279
Median Absolute Deviation (MAD): 76.7115
Skewness: 0.41453885
Sum: 2825697.5
Variance: 11936.965
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1206.002 2
 
0.1%
1014.49 2
 
0.1%
1187.56 2
 
0.1%
1023.719 2
 
0.1%
1192.175 2
 
0.1%
1134.729 2
 
0.1%
963.377 2
 
0.1%
1217.112 2
 
0.1%
1253.276 2
 
0.1%
1103.068 2
 
0.1%
Other values (2480) 2480
99.2%
ValueCountFrequency (%)
868.485 1
< 0.1%
871.458 1
< 0.1%
884.106 1
< 0.1%
888.242 1
< 0.1%
889.398 1
< 0.1%
895.169 1
< 0.1%
899.493 1
< 0.1%
899.532 1
< 0.1%
902.59 1
< 0.1%
903.456 1
< 0.1%
ValueCountFrequency (%)
1559.45 1
< 0.1%
1520.525 1
< 0.1%
1492.183 1
< 0.1%
1491.946 1
< 0.1%
1490.954 1
< 0.1%
1476.738 1
< 0.1%
1468.224 1
< 0.1%
1465.654 1
< 0.1%
1454.583 1
< 0.1%
1453.922 1
< 0.1%

major_axis_length
Real number (ℝ)

Distinct: 2499
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 456.60184
Minimum: 320.8446
Maximum: 661.9113
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 320.8446
5-th percentile: 376.25062
Q1: 414.95785
median: 449.4966
Q3: 492.73765
95-th percentile: 556.34869
Maximum: 661.9113
Range: 341.0667
Interquartile range (IQR): 77.7798

Descriptive statistics

Standard deviation: 56.235704
Coefficient of variation (CV): 0.12316136
Kurtosis: -0.015689806
Mean: 456.60184
Median Absolute Deviation (MAD): 38.3324
Skewness: 0.50297956
Sum: 1141504.6
Variance: 3162.4544
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
465.7347 2
 
0.1%
539.6806 1
 
< 0.1%
584.4799 1
 
< 0.1%
514.3802 1
 
< 0.1%
424.5284 1
 
< 0.1%
561.5072 1
 
< 0.1%
473.3268 1
 
< 0.1%
607.8398 1
 
< 0.1%
441.2244 1
 
< 0.1%
622.8818 1
 
< 0.1%
Other values (2489) 2489
99.6%
ValueCountFrequency (%)
320.8446 1
< 0.1%
324.0113 1
< 0.1%
326.1485 1
< 0.1%
328.2684 1
< 0.1%
329.9696 1
< 0.1%
331.6936 1
< 0.1%
334.1895 1
< 0.1%
340.6951 1
< 0.1%
342.3154 1
< 0.1%
342.3836 1
< 0.1%
ValueCountFrequency (%)
661.9113 1
< 0.1%
648.9984 1
< 0.1%
648.4012 1
< 0.1%
640.1907 1
< 0.1%
632.2535 1
< 0.1%
632.108 1
< 0.1%
629.723 1
< 0.1%
625.3347 1
< 0.1%
623.0155 1
< 0.1%
622.8818 1
< 0.1%

minor_axis_length
Real number (ℝ)

Distinct: 2497
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 225.79492
Minimum: 152.1718
Maximum: 305.818
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 152.1718
5-th percentile: 188.26277
Q1: 211.24592
median: 224.7031
Q3: 240.67287
95-th percentile: 266.12709
Maximum: 305.818
Range: 153.6462
Interquartile range (IQR): 29.42695

Descriptive statistics

Standard deviation: 23.297245
Coefficient of variation (CV): 0.10317878
Kurtosis: 0.073234814
Mean: 225.79492
Median Absolute Deviation (MAD): 14.5294
Skewness: 0.10430328
Sum: 564487.3
Variance: 542.7616
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
221.2116 2
 
0.1%
220.6852 2
 
0.1%
229.4863 2
 
0.1%
220.2388 1
 
< 0.1%
227.7773 1
 
< 0.1%
200.5012 1
 
< 0.1%
226.9396 1
 
< 0.1%
203.6701 1
 
< 0.1%
232.3653 1
 
< 0.1%
197.3337 1
 
< 0.1%
Other values (2487) 2487
99.5%
ValueCountFrequency (%)
152.1718 1
< 0.1%
154.002 1
< 0.1%
154.5346 1
< 0.1%
154.7253 1
< 0.1%
155.4211 1
< 0.1%
156.1008 1
< 0.1%
160.6267 1
< 0.1%
162.796 1
< 0.1%
163.8458 1
< 0.1%
164.7038 1
< 0.1%
ValueCountFrequency (%)
305.818 1
< 0.1%
300.5777 1
< 0.1%
297.7952 1
< 0.1%
296.2779 1
< 0.1%
293.4921 1
< 0.1%
293.47 1
< 0.1%
292.9598 1
< 0.1%
292.6174 1
< 0.1%
292.53 1
< 0.1%
292.4844 1
< 0.1%

convex_area
Real number (ℝ)

Distinct: 2445
Distinct (%): 97.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 81508.084
Minimum: 48366
Maximum: 138384
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 48366
5-th percentile: 61477.9
Q1: 71512
median: 79872
Q3: 90797.75
95-th percentile: 105956.45
Maximum: 138384
Range: 90018
Interquartile range (IQR): 19285.75

Descriptive statistics

Standard deviation: 13764.093
Coefficient of variation (CV): 0.16886782
Kurtosis: 0.12302642
Mean: 81508.084
Median Absolute Deviation (MAD): 9346
Skewness: 0.49401595
Sum: 2.0377021 × 10⁸
Variance: 1.8945025 × 10⁸
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
77745 3
 
0.1%
76640 3
 
0.1%
89255 2
 
0.1%
54979 2
 
0.1%
70045 2
 
0.1%
80649 2
 
0.1%
74727 2
 
0.1%
87868 2
 
0.1%
79445 2
 
0.1%
67537 2
 
0.1%
Other values (2435) 2478
99.1%
ValueCountFrequency (%)
48366 1
< 0.1%
48643 1
< 0.1%
49739 1
< 0.1%
50268 1
< 0.1%
50306 1
< 0.1%
51092 1
< 0.1%
51230 1
< 0.1%
51385 1
< 0.1%
51648 1
< 0.1%
52013 1
< 0.1%
ValueCountFrequency (%)
138384 1
< 0.1%
136373 1
< 0.1%
133706 1
< 0.1%
131934 1
< 0.1%
131713 1
< 0.1%
127906 1
< 0.1%
127781 1
< 0.1%
126962 1
< 0.1%
126538 1
< 0.1%
126196 1
< 0.1%

equiv_diameter
Real number (ℝ)

Distinct: 2424
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 319.33423
Minimum: 247.0584
Maximum: 417.0029
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 247.0584
5-th percentile: 278.17431
Q1: 300.16798
median: 317.30535
Q3: 338.05737
95-th percentile: 365.32967
Maximum: 417.0029
Range: 169.9445
Interquartile range (IQR): 37.8894

Descriptive statistics

Standard deviation: 26.89192
Coefficient of variation (CV): 0.084212456
Kurtosis: -0.14670252
Mean: 319.33423
Median Absolute Deviation (MAD): 18.6827
Skewness: 0.27186759
Sum: 798335.58
Variance: 723.17535
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
310.3289 3
 
0.1%
351.9168 3
 
0.1%
294.3816 3
 
0.1%
351.3012 2
 
0.1%
312.0147 2
 
0.1%
304.7731 2
 
0.1%
307.8449 2
 
0.1%
307.6484 2
 
0.1%
329.0807 2
 
0.1%
335.935 2
 
0.1%
Other values (2414) 2477
99.1%
ValueCountFrequency (%)
247.0584 1
< 0.1%
247.4677 1
< 0.1%
250.2128 1
< 0.1%
250.4722 1
< 0.1%
251.4868 1
< 0.1%
253.5089 1
< 0.1%
253.9981 1
< 0.1%
254.151 1
< 0.1%
254.3788 1
< 0.1%
256.2067 1
< 0.1%
ValueCountFrequency (%)
417.0029 1
< 0.1%
415.2911 1
< 0.1%
410.0149 1
< 0.1%
408.269 1
< 0.1%
406.954 1
< 0.1%
402.1734 1
< 0.1%
402.0626 1
< 0.1%
400.4538 1
< 0.1%
400.053 1
< 0.1%
399.2836 1
< 0.1%

eccentricity
Real number (ℝ)

Distinct: 1295
Distinct (%): 51.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8608794
Minimum: 0.4921
Maximum: 0.9481
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 0.4921
5-th percentile: 0.783195
Q1: 0.8317
median: 0.8637
Q3: 0.897025
95-th percentile: 0.924305
Maximum: 0.9481
Range: 0.456
Interquartile range (IQR): 0.065325

Descriptive statistics

Standard deviation: 0.045167399
Coefficient of variation (CV): 0.052466581
Kurtosis: 1.7942093
Mean: 0.8608794
Median Absolute Deviation (MAD): 0.0327
Skewness: -0.74862334
Sum: 2152.1985
Variance: 0.0020400939
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.8915 7
 
0.3%
0.8987 7
 
0.3%
0.8834 7
 
0.3%
0.8495 7
 
0.3%
0.835 6
 
0.2%
0.8433 6
 
0.2%
0.8985 6
 
0.2%
0.8504 6
 
0.2%
0.8914 6
 
0.2%
0.8828 6
 
0.2%
Other values (1285) 2436
97.4%
ValueCountFrequency (%)
0.4921 1
< 0.1%
0.6586 1
< 0.1%
0.686 1
< 0.1%
0.688 1
< 0.1%
0.6903 1
< 0.1%
0.6915 1
< 0.1%
0.6944 1
< 0.1%
0.708 1
< 0.1%
0.7105 1
< 0.1%
0.7128 1
< 0.1%
ValueCountFrequency (%)
0.9481 1
< 0.1%
0.9464 1
< 0.1%
0.9457 1
< 0.1%
0.9448 1
< 0.1%
0.9443 1
< 0.1%
0.943 1
< 0.1%
0.9429 1
< 0.1%
0.9428 1
< 0.1%
0.942 1
< 0.1%
0.9415 1
< 0.1%

solidity
Real number (ℝ)

Distinct: 166
Distinct (%): 6.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9894916
Minimum: 0.9186
Maximum: 0.9944
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 0.9186
5-th percentile: 0.9841
Q1: 0.9883
median: 0.9903
Q3: 0.9915
95-th percentile: 0.9928
Maximum: 0.9944
Range: 0.0758
Interquartile range (IQR): 0.0032

Descriptive statistics

Standard deviation: 0.0034935924
Coefficient of variation (CV): 0.0035306943
Kurtosis: 81.121646
Mean: 0.9894916
Median Absolute Deviation (MAD): 0.0015
Skewness: -5.6910091
Sum: 2473.729
Variance: 1.2205188 × 10⁻⁵
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.9905 62
 
2.5%
0.9912 61
 
2.4%
0.9916 57
 
2.3%
0.9911 56
 
2.2%
0.9906 55
 
2.2%
0.9918 54
 
2.2%
0.9904 54
 
2.2%
0.9909 52
 
2.1%
0.9913 51
 
2.0%
0.9914 50
 
2.0%
Other values (156) 1948
77.9%
ValueCountFrequency (%)
0.9186 1
< 0.1%
0.9542 1
< 0.1%
0.9567 1
< 0.1%
0.9582 1
< 0.1%
0.9639 1
< 0.1%
0.9661 1
< 0.1%
0.9699 1
< 0.1%
0.9702 1
< 0.1%
0.972 1
< 0.1%
0.9728 1
< 0.1%
ValueCountFrequency (%)
0.9944 1
 
< 0.1%
0.9943 1
 
< 0.1%
0.9939 2
 
0.1%
0.9938 6
 
0.2%
0.9937 8
 
0.3%
0.9936 12
0.5%
0.9935 8
 
0.3%
0.9934 12
0.5%
0.9933 6
 
0.2%
0.9932 22
0.9%

extent
Real number (ℝ)

Distinct: 1392
Distinct (%): 55.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.69320452
Minimum: 0.468
Maximum: 0.8296
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 0.468
5-th percentile: 0.56768
Q1: 0.6589
median: 0.71305
Q3: 0.740225
95-th percentile: 0.7623
Maximum: 0.8296
Range: 0.3616
Interquartile range (IQR): 0.081325

Descriptive statistics

Standard deviation: 0.060913648
Coefficient of variation (CV): 0.087872549
Kurtosis: 0.42498155
Mean: 0.69320452
Median Absolute Deviation (MAD): 0.03345
Skewness: -1.0265683
Sum: 1733.0113
Variance: 0.0037104725
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.7249 9
 
0.4%
0.7445 8
 
0.3%
0.7403 8
 
0.3%
0.7435 7
 
0.3%
0.7379 7
 
0.3%
0.7201 7
 
0.3%
0.7325 7
 
0.3%
0.7424 7
 
0.3%
0.7393 7
 
0.3%
0.7189 6
 
0.2%
Other values (1382) 2427
97.1%
ValueCountFrequency (%)
0.468 1
< 0.1%
0.4695 1
< 0.1%
0.4822 1
< 0.1%
0.4843 1
< 0.1%
0.4888 1
< 0.1%
0.495 1
< 0.1%
0.497 1
< 0.1%
0.4977 1
< 0.1%
0.5 1
< 0.1%
0.5005 1
< 0.1%
ValueCountFrequency (%)
0.8296 1
< 0.1%
0.7993 1
< 0.1%
0.7954 1
< 0.1%
0.7879 1
< 0.1%
0.7831 1
< 0.1%
0.7824 1
< 0.1%
0.7814 1
< 0.1%
0.781 1
< 0.1%
0.7808 1
< 0.1%
0.7801 1
< 0.1%

roundness
Real number (ℝ)

Distinct: 1480
Distinct (%): 59.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.79153276
Minimum: 0.5546
Maximum: 0.9396
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 0.5546
5-th percentile: 0.6922
Q1: 0.7519
median: 0.79775
Q3: 0.834325
95-th percentile: 0.873715
Maximum: 0.9396
Range: 0.385
Interquartile range (IQR): 0.082425

Descriptive statistics

Standard deviation: 0.055923947
Coefficient of variation (CV): 0.070652725
Kurtosis: -0.239235
Mean: 0.79153276
Median Absolute Deviation (MAD): 0.04015
Skewness: -0.37268712
Sum: 1978.8319
Variance: 0.0031274878
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.7609 7
 
0.3%
0.806 6
 
0.2%
0.8267 6
 
0.2%
0.7749 6
 
0.2%
0.7933 6
 
0.2%
0.8357 6
 
0.2%
0.8028 6
 
0.2%
0.835 6
 
0.2%
0.7413 5
 
0.2%
0.781 5
 
0.2%
Other values (1470) 2441
97.6%
ValueCountFrequency (%)
0.5546 1
< 0.1%
0.5825 1
< 0.1%
0.6153 1
< 0.1%
0.6226 1
< 0.1%
0.627 1
< 0.1%
0.6327 1
< 0.1%
0.6338 1
< 0.1%
0.6374 1
< 0.1%
0.6391 1
< 0.1%
0.6426 1
< 0.1%
ValueCountFrequency (%)
0.9396 1
< 0.1%
0.9255 1
< 0.1%
0.9233 1
< 0.1%
0.9221 1
< 0.1%
0.9214 1
< 0.1%
0.9193 1
< 0.1%
0.9162 1
< 0.1%
0.9161 1
< 0.1%
0.916 1
< 0.1%
0.9156 1
< 0.1%

aspect_ration
Real number (ℝ)

Distinct: 2237
Distinct (%): 89.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0417023
Minimum: 1.1487
Maximum: 3.1444
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 1.1487
5-th percentile: 1.60829
Q1: 1.80105
median: 1.9842
Q3: 2.262075
95-th percentile: 2.620525
Maximum: 3.1444
Range: 1.9957
Interquartile range (IQR): 0.461025

Descriptive statistics

Standard deviation: 0.31599688
Coefficient of variation (CV): 0.15477128
Kurtosis: -0.20336105
Mean: 2.0417023
Median Absolute Deviation (MAD): 0.2177
Skewness: 0.54823109
Sum: 5104.2558
Variance: 0.099854031
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1.8606 4
 
0.2%
1.8491 4
 
0.2%
1.7648 3
 
0.1%
1.7601 3
 
0.1%
1.8275 3
 
0.1%
2.6348 3
 
0.1%
1.9006 3
 
0.1%
1.8176 3
 
0.1%
2.2067 3
 
0.1%
1.806 3
 
0.1%
Other values (2227) 2468
98.7%
ValueCountFrequency (%)
1.1487 1
< 0.1%
1.329 1
< 0.1%
1.3744 1
< 0.1%
1.378 1
< 0.1%
1.3822 1
< 0.1%
1.3843 1
< 0.1%
1.3897 1
< 0.1%
1.4161 1
< 0.1%
1.421 1
< 0.1%
1.4259 1
< 0.1%
ValueCountFrequency (%)
3.1444 1
< 0.1%
3.0969 1
< 0.1%
3.0759 1
< 0.1%
3.051 1
< 0.1%
3.0374 1
< 0.1%
3.0041 1
< 0.1%
3.0017 1
< 0.1%
2.9988 1
< 0.1%
2.9789 1
< 0.1%
2.9665 1
< 0.1%

compactness
Real number (ℝ)

Distinct: 1405
Distinct (%): 56.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.70412052
Minimum: 0.5608
Maximum: 0.9049
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.7 KiB

Quantile statistics

Minimum: 0.5608
5-th percentile: 0.6158
Q1: 0.663475
median: 0.7077
Q3: 0.7435
95-th percentile: 0.785605
Maximum: 0.9049
Range: 0.3441
Interquartile range (IQR): 0.080025

Descriptive statistics

Standard deviation: 0.053066885
Coefficient of variation (CV): 0.075366196
Kurtosis: -0.50083343
Mean: 0.70412052
Median Absolute Deviation (MAD): 0.0394
Skewness: -0.062376578
Sum: 1760.3013
Variance: 0.0028160943
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.7073 7
 
0.3%
0.7264 7
 
0.3%
0.7077 6
 
0.2%
0.7414 6
 
0.2%
0.7093 6
 
0.2%
0.7518 6
 
0.2%
0.6175 6
 
0.2%
0.7435 6
 
0.2%
0.6851 6
 
0.2%
0.7356 5
 
0.2%
Other values (1395) 2439
97.6%
ValueCountFrequency (%)
0.5608 1
< 0.1%
0.567 1
< 0.1%
0.5673 1
< 0.1%
0.5687 1
< 0.1%
0.5698 1
< 0.1%
0.5704 1
< 0.1%
0.5732 1
< 0.1%
0.5753 1
< 0.1%
0.5768 1
< 0.1%
0.5785 1
< 0.1%
ValueCountFrequency (%)
0.9049 1
< 0.1%
0.8665 1
< 0.1%
0.852 1
< 0.1%
0.8491 1
< 0.1%
0.8481 1
< 0.1%
0.8474 1
< 0.1%
0.8468 1
< 0.1%
0.8377 1
< 0.1%
0.8374 1
< 0.1%
0.8359 1
< 0.1%

class
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 225.8 KiB
Çerçevelik: 1300
Ürgüp Sivrisi: 1200

Length

Max length: 13
Median length: 10
Mean length: 11.44
Min length: 10

Characters and Unicode

Total characters: 28600
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
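
The character and category counts in this section can be recomputed from the two class labels and their row counts. The sketch below uses Python's standard `unicodedata` module and the counts from the "Common Values" table. It covers general categories only, because that module exposes no script or block property, so the script and block figures are not reproduced here.

```python
import unicodedata
from collections import Counter

# Class labels and their row counts, copied from the "Common Values" table.
counts = {"Çerçevelik": 1300, "Ürgüp Sivrisi": 1200}

char_freq = Counter()      # occurrences of each character over all rows
category_freq = Counter()  # occurrences per Unicode general category
for label, rows in counts.items():
    # Normalize to composed form so that "Ç" counts as a single code point.
    for ch in unicodedata.normalize("NFC", label):
        char_freq[ch] += rows
        category_freq[unicodedata.category(ch)] += rows

total_chars = sum(char_freq.values())
mean_length = total_chars / sum(counts.values())

print(total_chars, len(char_freq), dict(category_freq), mean_length)
```

Here `Ll`, `Lu`, and `Zs` are the standard abbreviations for Lowercase Letter, Uppercase Letter, and Space Separator.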

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Çerçevelik
2nd row: Çerçevelik
3rd row: Çerçevelik
4th row: Çerçevelik
5th row: Çerçevelik

Common Values

ValueCountFrequency (%)
Çerçevelik 1300
52.0%
Ürgüp Sivrisi 1200
48.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
çerçevelik 1300
35.1%
ürgüp 1200
32.4%
sivrisi 1200
32.4%

Most occurring characters

ValueCountFrequency (%)
i 4900
17.1%
e 3900
13.6%
r 3700
12.9%
v 2500
8.7%
Ç 1300
 
4.5%
ç 1300
 
4.5%
l 1300
 
4.5%
k 1300
 
4.5%
Ü 1200
 
4.2%
g 1200
 
4.2%
Other values (5) 6000
21.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23700
82.9%
Uppercase Letter 3700
 
12.9%
Space Separator 1200
 
4.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 4900
20.7%
e 3900
16.5%
r 3700
15.6%
v 2500
10.5%
ç 1300
 
5.5%
l 1300
 
5.5%
k 1300
 
5.5%
g 1200
 
5.1%
ü 1200
 
5.1%
p 1200
 
5.1%
Uppercase Letter
ValueCountFrequency (%)
Ç 1300
35.1%
Ü 1200
32.4%
S 1200
32.4%
Space Separator
ValueCountFrequency (%)
(space) 1200
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 27400
95.8%
Common 1200
 
4.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 4900
17.9%
e 3900
14.2%
r 3700
13.5%
v 2500
9.1%
Ç 1300
 
4.7%
ç 1300
 
4.7%
l 1300
 
4.7%
k 1300
 
4.7%
Ü 1200
 
4.4%
g 1200
 
4.4%
Other values (4) 4800
17.5%
Common
ValueCountFrequency (%)
(space) 1200
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23600
82.5%
None 5000
 
17.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 4900
20.8%
e 3900
16.5%
r 3700
15.7%
v 2500
10.6%
l 1300
 
5.5%
k 1300
 
5.5%
g 1200
 
5.1%
p 1200
 
5.1%
(space) 1200
 
5.1%
S 1200
 
5.1%
None
ValueCountFrequency (%)
Ç 1300
26.0%
ç 1300
26.0%
Ü 1200
24.0%
ü 1200
24.0%

Interactions

[144 pairwise interaction plots (12 × 12 numeric variables); images omitted.]

Correlations


Auto

The auto setting picks an interpretable pairwise metric for each pair of columns based on their variable types, using the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramér's V, [0,1]
  • Numerical-Categorical : Cramér's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used to discretize the numerical column in a Numerical-Categorical pair can be changed using config.correlations["auto"].n_bins. The number of bins sets the granularity of the association being measured.

This configuration uses the recommended metric for each pair of columns.
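
The Numerical-Categorical case above can be illustrated with a minimal sketch. It assumes equal-width binning and the plain (not bias-corrected) form of Cramér's V; the binning strategy and any correction actually applied by the report generator may differ, and the helper names `discretize` and `cramers_v` are hypothetical.

```python
from collections import Counter
from math import sqrt


def discretize(values, n_bins):
    """Map each number to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant column
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def cramers_v(xs, ys):
    """Plain Cramér's V between two categorical sequences of equal length."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row, col = Counter(xs), Counter(ys)
    k = min(len(row), len(col))
    if k < 2:
        return 0.0  # assumption: report 0 when one column is constant
    chi2 = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


# Toy data (not taken from the profiled dataset).
values = [1.0, 1.2, 1.1, 5.0, 5.3, 5.1]
labels = ["a", "a", "a", "b", "b", "b"]
print(cramers_v(discretize(values, n_bins=2), labels))  # 1.0: each bin maps to one label
```

With more bins the same data can yield a different value, which is why the bin count controls the granularity of the measured association.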

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
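
The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their positions is an assumption of this sketch, and the function names are illustrative only.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Toy data: y = x**3 is monotonic but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman(xs, ys), pearson(xs, ys))
```

On this toy data the rank-based coefficient is at its maximum while the linear coefficient is lower, which is the behavior the paragraph describes.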

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the slope does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
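
A minimal sketch of this formula, using only the standard library, also makes the invariance property concrete. The function name `pearson_r` and the toy data are illustrative assumptions, not part of the report.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


xs = [1, 2, 3, 4]
ys = [2, 1, 4, 3]

r = pearson_r(xs, ys)
# Shifting and rescaling ys by a positive factor leaves r unchanged.
r_rescaled = pearson_r(xs, [10 * y + 7 for y in ys])
# A negative scale factor flips the sign of r but keeps its magnitude.
r_flipped = pearson_r(xs, [-y for y in ys])
print(r, r_rescaled, r_flipped)
```

The common 1/n (or 1/(n-1)) factors in the covariance and standard deviations cancel, so the sketch omits them.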

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
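
The pair-counting definition above corresponds to the variant usually called τ-a, sketched below in plain Python. Statistical libraries often compute τ-b instead, which adjusts the denominator for ties, so values may differ when the data contain ties. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables order the pair the same way
        elif sign < 0:
            discordant += 1  # the variables order the pair in opposite ways
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Toy data: 5 concordant pairs and 1 discordant pair out of 6.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

This direct version examines every pair and is therefore quadratic in the number of observations, which is acceptable for illustration but slow for large columns.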

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for the phik package.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.

Sample

First rows

    area  perimeter  major_axis_length  minor_axis_length  convex_area  equiv_diameter  eccentricity  solidity  extent  roundness  aspect_ration  compactness  class
0  56276    888.242           326.1485           220.2388        56831        267.6805        0.7376    0.9902  0.7453     0.8963         1.4809       0.8207  Çerçevelik
1  76631   1068.146           417.1932           234.2289        77280        312.3614        0.8275    0.9916  0.7151     0.8440         1.7811       0.7487  Çerçevelik
2  71623   1082.987           435.8328           211.0457        72663        301.9822        0.8749    0.9857  0.7400     0.7674         2.0651       0.6929  Çerçevelik
3  66458    992.051           381.5638           222.5322        67118        290.8899        0.8123    0.9902  0.7396     0.8486         1.7146       0.7624  Çerçevelik
4  66107    998.146           383.8883           220.4545        67117        290.1207        0.8187    0.9850  0.6752     0.8338         1.7413       0.7557  Çerçevelik
5  73191   1041.460           405.8132           231.4261        73969        305.2698        0.8215    0.9895  0.7165     0.8480         1.7535       0.7522  Çerçevelik
6  73338   1020.055           392.2516           238.5494        73859        305.5762        0.7938    0.9929  0.7187     0.8857         1.6443       0.7790  Çerçevelik
7  69692   1049.108           421.4875           211.7707        70442        297.8836        0.8646    0.9894  0.6736     0.7957         1.9903       0.7067  Çerçevelik
8  95727   1231.609           488.1199           251.3086        96831        349.1180        0.8573    0.9886  0.6188     0.7930         1.9423       0.7152  Çerçevelik
9  73465   1047.767           413.6504           227.2644        74089        305.8407        0.8356    0.9916  0.7443     0.8409         1.8201       0.7394  Çerçevelik

Last rows

       area  perimeter  major_axis_length  minor_axis_length  convex_area  equiv_diameter  eccentricity  solidity  extent  roundness  aspect_ration  compactness  class
2490  51555    934.911           401.8321           164.7038        52013        256.2067        0.9121    0.9912  0.7187     0.7412         2.4397       0.6376  Ürgüp Sivrisi
2491  69836   1010.605           396.6286           224.7918        70419        298.1911        0.8239    0.9917  0.6693     0.8593         1.7644       0.7518  Ürgüp Sivrisi
2492  84236   1274.656           456.9323           237.1540        85248        327.4944        0.8548    0.9881  0.6104     0.6515         1.9267       0.7167  Ürgüp Sivrisi
2493  58987    977.410           404.0779           186.3710        59518        274.0522        0.8873    0.9911  0.7327     0.7759         2.1681       0.6782  Ürgüp Sivrisi
2494  79755   1146.431           470.3888           217.8296        80649        318.6647        0.8863    0.9889  0.7175     0.7626         2.1594       0.6774  Ürgüp Sivrisi
2495  79637   1224.710           533.1513           190.4367        80381        318.4289        0.9340    0.9907  0.4888     0.6672         2.7996       0.5973  Ürgüp Sivrisi
2496  69647   1084.318           462.9416           191.8210        70216        297.7874        0.9101    0.9919  0.6002     0.7444         2.4134       0.6433  Ürgüp Sivrisi
2497  87994   1210.314           507.2200           222.1872        88702        334.7199        0.8990    0.9920  0.7643     0.7549         2.2828       0.6599  Ürgüp Sivrisi
2498  80011   1182.947           501.9065           204.7531        80902        319.1758        0.9130    0.9890  0.7374     0.7185         2.4513       0.6359  Ürgüp Sivrisi
2499  84934   1159.933           462.8951           234.5597        85781        328.8485        0.8621    0.9901  0.7360     0.7933         1.9735       0.7104  Ürgüp Sivrisi